Winter 2026
Joint posterior
\[ \pi(\boldsymbol{\theta} \mid \boldsymbol{x}) = \frac{f(\boldsymbol{x} \mid \boldsymbol{\theta}) \, \pi(\boldsymbol{\theta})}{\int_\boldsymbol{\Theta} f(\boldsymbol{x} \mid \boldsymbol{\theta}) \, \Pi(\mbox{d} \boldsymbol{\theta}) } \]
Marginal posterior
Splitting up the parameter \(\boldsymbol{\theta} = (\boldsymbol{\theta}_1, \boldsymbol{\theta}_2)\), we can write \[\begin{align} \pi(\boldsymbol{\theta}_1 \mid \boldsymbol{x}) &= \int_{\boldsymbol{\Theta}_2} \Pi( \boldsymbol{\theta}_1, \, \mbox{d} \boldsymbol{\theta}_2 \mid \boldsymbol{x}) \\ &= \int_{\boldsymbol{\Theta}_2} \pi( \boldsymbol{\theta}_1 \mid \boldsymbol{\theta}_2, \boldsymbol{x}) \, \Pi_2(\mbox{d} \boldsymbol{\theta}_2 \mid \boldsymbol{x}) \, . \end{align}\]
Prior choices:
Assuming \(x_1, \ldots, x_n \overset{\text{iid}}{\sim} \mbox{Normal}(\mu, \sigma^2)\),
\(f(\boldsymbol{x} \mid \mu, \sigma^2) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(x_i - \mu)^2}{2\sigma^2} \right\} = (2\pi\sigma^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{i=1}^n (x_i - \mu)^2 \right\}\)
Noninformative prior
\(\pi(\mu, \sigma^2) \propto \frac{1}{\sigma^2}\)
This is equivalent to using \(\pi(\mu, \log\sigma) \propto 1\), and is the (independence) Jeffreys prior.
Then we can write \(\pi(\mu, \sigma^2 \mid \boldsymbol{x}) = \pi(\sigma^2 \mid \boldsymbol{x}) \times \pi(\mu \mid \sigma^2, \boldsymbol{x})\)
(See normal-inverse-gamma distribution)
where
\(\pi(\sigma^2 \mid \boldsymbol{x}) \propto (\sigma^2)^{-(n+1)/2} \exp\left\{ -\frac{(n-1)s^2}{2\sigma^2} \right\}\)
(with \(\bar{x}\) the sample mean and \(s^2\) the sample variance), the kernel of
\(\sigma^2 \mid \boldsymbol{x} \sim \mbox{Inv-Gamma}\left(\frac{n-1}{2}, \, \frac{(n-1)s^2}{2}\right)\),
and
\(\pi(\mu \mid \sigma^2, \boldsymbol{x}) \propto \exp\left\{ -\frac{n(\mu - \bar{x})^2}{2\sigma^2} \right\}\), the kernel of \(\mu \mid \sigma^2, \boldsymbol{x} \sim \mbox{Normal}\left(\bar{x}, \, \frac{\sigma^2}{n}\right)\).
In summary, under the prior \(\pi(\mu, \sigma^2) \propto \frac{1}{\sigma^2}\), the posterior factors as \[\begin{align} \sigma^2 \mid \boldsymbol{x} &\sim \mbox{Inv-Gamma}\left(\frac{n-1}{2}, \, \frac{(n-1)s^2}{2} \right) \, , \\ \mu \mid \sigma^2, \boldsymbol{x} &\sim \mbox{Normal}\left(\bar{x}, \, \frac{\sigma^2}{n} \right) \, . \end{align}\]
This suggests sampling \((\mu, \sigma^2)\) from the posterior sequentially with the compositional method: draw \(\sigma^2\) from its marginal posterior, then \(\mu\) from its conditional posterior given \(\sigma^2\) (see the sketch after the proposition below).
Proposition
Gamma, Inv-Gamma relation
If \(X \sim \mbox{Gamma}(a, \, \texttt{rate}=b)\),
then \(Y = \frac{1}{X} \sim \mbox{Inv-Gamma}(a, \, \texttt{scale} = b)\)
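Below is a minimal sketch of the compositional sampler in Python (numpy assumed; the function name is my own choice). It uses the proposition to draw Inv-Gamma variates as reciprocals of Gamma variates; note that numpy parameterizes the Gamma by shape and scale, so a rate of \(b\) enters as a scale of \(1/b\).

```python
import numpy as np

def sample_noninf_posterior(x, n_draws=10_000, seed=0):
    """Compositional draws of (mu, sigma^2) under pi(mu, sigma^2) ∝ 1/sigma^2."""
    rng = np.random.default_rng(seed)
    n, xbar, s2 = len(x), np.mean(x), np.var(x, ddof=1)
    # sigma^2 | x ~ Inv-Gamma((n-1)/2, (n-1)s^2/2):
    # draw X ~ Gamma(a, rate=b) as rng.gamma(a, scale=1/b), then invert.
    a, b = (n - 1) / 2, (n - 1) * s2 / 2
    sigma2 = 1.0 / rng.gamma(a, 1.0 / b, size=n_draws)
    # mu | sigma^2, x ~ Normal(xbar, sigma^2 / n), one draw per sigma^2 draw
    mu = rng.normal(xbar, np.sqrt(sigma2 / n))
    return mu, sigma2
```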
If interest lies only in \(\mu\), we can integrate \(\sigma^2\) out of the factored posterior \(\pi(\mu, \sigma^2 \mid \boldsymbol{x}) = \pi(\sigma^2 \mid \boldsymbol{x}) \times \pi(\mu \mid \sigma^2, \boldsymbol{x})\) to find
\(\pi(\mu \mid \boldsymbol{x}) = \int_0^\infty \pi(\sigma^2 \mid \boldsymbol{x}) \times \pi(\mu \mid \sigma^2, \boldsymbol{x}) \, \mbox{d}\sigma^2 \, ,\)
which is \(\frac{s}{\sqrt{n}} t_{n-1} + \bar{x}\),
where \(t_{\text{df}}\) denotes a Student-\(t\) distribution with \(\text{df}\) degrees of freedom.
Posterior predictive
\(f(\tilde{x} \mid \boldsymbol{x}) = \int_0^\infty \int_{-\infty}^\infty f(\tilde{x} \mid \mu, \sigma^2) \, \pi(\mu, \sigma^2 \mid \boldsymbol{x}) \, \mbox{d} \mu \, \mbox{d}\sigma^2\)
which is \[s\sqrt{1 + \frac{1}{n}} \times t_{n-1} + \bar{x} \, ,\]
that is, location-scale Student-\(t\) with \(n-1\) degrees of freedom.
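As a quick numerical check, here is a sketch continuing the sampler above (the simulated data are hypothetical): Monte Carlo predictive draws, obtained by pushing each posterior draw of \((\mu, \sigma^2)\) through the likelihood, can be compared against the closed form via scipy.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.normal(loc=10.0, scale=2.0, size=12)    # hypothetical simulated data
n, xbar, s = len(x), x.mean(), x.std(ddof=1)

# Monte Carlo predictive: one x-tilde per posterior draw of (mu, sigma^2).
mu, sigma2 = sample_noninf_posterior(x)         # function from the sketch above
x_tilde = rng.normal(mu, np.sqrt(sigma2))

# Closed form: location-scale Student-t with n - 1 degrees of freedom.
pred = stats.t(df=n - 1, loc=xbar, scale=s * np.sqrt(1 + 1 / n))
print(np.mean(x_tilde), pred.mean())            # both should be close to xbar
```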
Fully conjugate prior
\(\pi(\mu, \sigma^2) = \pi(\sigma^2) \times \pi( \mu \mid \sigma^2)\)
with \[\begin{align} \sigma^2 &\sim \mbox{Inv-Gamma}\left( \frac{n_0}{2}, \, \frac{n_0 s_0^2}{2}\right) \, , \\ \mu \mid \sigma^2 &\sim \mbox{Normal}\left(m_0, \, \frac{\sigma^2}{\kappa_0}\right) \, . \end{align}\]
Note that this is equivalent to the normal-inverse-gamma distribution, with density
\(\pi(\mu, \sigma^2) \propto (\sigma^2)^{-(n_0 + 3)/2} \exp\left\{ -\frac{n_0 s_0^2 + \kappa_0 (\mu - m_0)^2}{2\sigma^2} \right\}\)
For example, \(n_0 = 5\), \(s_0^2 = 1\), \(m_0 = 0\), \(\kappa_0 = 3\).
Prior \(\pi(\mu, \sigma^2) = \mbox{Inv-Gamma}\left(\sigma^2; \, \frac{n_0}{2}, \, \frac{n_0 s_0^2}{2} \right) \times \mbox{Normal}\left( \mu ; \, m_0, \, \frac{\sigma^2}{\kappa_0}\right)\)
\(n_0\): prior "sample size" for \(\sigma^2\) (prior degrees of freedom)
\(s_0^2\): prior point estimate of \(\sigma^2\)
\(m_0\): prior mean of \(\mu\)
\(\kappa_0\): prior "sample size" for \(\mu\) (number of observations' worth of information about \(\mu\))
Posterior \(\pi(\mu, \sigma^2 \mid \boldsymbol{x}) = \mbox{Inv-Gamma}\left(\sigma^2; \, \frac{n_1}{2}, \, \frac{n_1 s_1^2}{2} \right) \times \mbox{Normal}\left( \mu ; \, m_1, \, \frac{\sigma^2}{\kappa_1}\right)\)
\(n_1 = n_0 + n\)
\(n_1 s_1^2 = n_0 s_0^2 + (n-1) s^2 + \frac{\kappa_0 \, n}{\kappa_0 + n} (\bar{x} - m_0)^2\)
\(m_1 = \frac{\kappa_0 m_0 + n \bar{x}}{\kappa_0 + n}\)
\(\kappa_1 = \kappa_0 + n\)
As with the noninformative case, \((\mu, \sigma^2)\) can be sampled sequentially with the compositional method.
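A sketch of the hyperparameter update and the corresponding compositional sampler, under the same numpy conventions as before (function names are my own):

```python
import numpy as np

def conjugate_update(x, n0, s0_sq, m0, k0):
    """Map prior hyperparameters to posterior hyperparameters."""
    n, xbar, s2 = len(x), np.mean(x), np.var(x, ddof=1)
    n1, k1 = n0 + n, k0 + n
    m1 = (k0 * m0 + n * xbar) / k1
    n1_s1_sq = n0 * s0_sq + (n - 1) * s2 + (k0 * n / k1) * (xbar - m0) ** 2
    return n1, n1_s1_sq / n1, m1, k1

def sample_conjugate_posterior(x, n0, s0_sq, m0, k0, n_draws=10_000, seed=0):
    """Compositional draws of (mu, sigma^2) under the fully conjugate prior."""
    rng = np.random.default_rng(seed)
    n1, s1_sq, m1, k1 = conjugate_update(x, n0, s0_sq, m0, k0)
    # sigma^2 | x ~ Inv-Gamma(n1/2, n1 s1^2 / 2), via the Gamma reciprocal trick
    sigma2 = 1.0 / rng.gamma(n1 / 2, 2.0 / (n1 * s1_sq), size=n_draws)
    # mu | sigma^2, x ~ Normal(m1, sigma^2 / kappa1)
    mu = rng.normal(m1, np.sqrt(sigma2 / k1))
    return mu, sigma2
```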
The marginal posterior of \(\mu\) is \(\sqrt{\frac{n_1 \, s_1^2}{n_1 \, \kappa_1}} \times t_{n_1} + m_1\),
that is, location-scale Student-\(t\) with \(n_1\) degrees of freedom.
Posterior predictive
\(f(\tilde{x} \mid \boldsymbol{x}) = \int_0^\infty \int_{-\infty}^\infty f(\tilde{x} \mid \mu, \sigma^2) \, \pi(\mu, \sigma^2 \mid \boldsymbol{x}) \, \mbox{d} \mu \, \mbox{d}\sigma^2\)
which is \[\sqrt{\left( \frac{n_1 \, s_1^2 (\kappa_1 + 1)}{n_1 \, \kappa_1} \right)} \times t_{n_1} + m_1 \, ,\]
that is, location-scale Student-\(t\) with \(n_1\) degrees of freedom.
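This closed form can be used directly; a sketch via scipy's location-scale Student-\(t\), reusing the `conjugate_update` helper from the sketch above:

```python
import numpy as np
from scipy import stats

def predictive_conjugate(x, n0, s0_sq, m0, k0):
    """Closed-form posterior predictive: location-scale t with n1 df."""
    n1, s1_sq, m1, k1 = conjugate_update(x, n0, s0_sq, m0, k0)
    return stats.t(df=n1, loc=m1, scale=np.sqrt(s1_sq * (k1 + 1) / k1))
```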
Conditionally conjugate prior
\(\pi(\mu, \sigma^2) = \mbox{Normal}(\mu; \, m_0, \, v_0) \times \mbox{Inv-Gamma}\left(\sigma^2; \, \frac{a_0}{2}, \, \frac{b_0}{2} \right)\)
Posterior complete (or full) conditionals
\(\mu \mid \sigma^2, \boldsymbol{x} \sim \mbox{Normal}(m_1, v_1)\) with \[v_1 = \frac{1}{\frac{1}{v_0} + \frac{n}{\sigma^2}} \quad \text{and} \quad m_1 = v_1 \times \left(\frac{1}{v_0} m_0 + \frac{n}{\sigma^2} \bar{x} \right) \, ,\]
and
\(\sigma^2 \mid \mu, \boldsymbol{x} \sim \mbox{Inv-Gamma}\left(\frac{a_1}{2}, \, \frac{b_1}{2} \right)\) with \[a_1 = a_0 + n \quad \text{and} \quad b_1 = b_0 + \sum_{i=1}^n (x_i - \mu)^2 \, .\]
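Unlike the conjugate cases, the joint posterior is not available in closed form here, but the two complete conditionals suggest a Gibbs sampler. A minimal sketch (the function name and initialization are my own choices):

```python
import numpy as np

def gibbs_cond_conjugate(x, m0, v0, a0, b0, n_iter=5_000, seed=0):
    """Gibbs sampler alternating the two complete conditionals above."""
    rng = np.random.default_rng(seed)
    n, xbar = len(x), np.mean(x)
    mu, sigma2 = xbar, np.var(x, ddof=1)   # initialize at data-based values
    draws = np.empty((n_iter, 2))
    for t in range(n_iter):
        # mu | sigma^2, x ~ Normal(m1, v1)
        v1 = 1.0 / (1.0 / v0 + n / sigma2)
        m1 = v1 * (m0 / v0 + n * xbar / sigma2)
        mu = rng.normal(m1, np.sqrt(v1))
        # sigma^2 | mu, x ~ Inv-Gamma(a1/2, b1/2), via the Gamma reciprocal trick
        a1 = a0 + n
        b1 = b0 + np.sum((x - mu) ** 2)
        sigma2 = 1.0 / rng.gamma(a1 / 2, 2.0 / b1)
        draws[t] = mu, sigma2
    return draws
```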
Uniform scale prior
\(\pi(\mu, \sigma) = \mbox{Normal}(\mu; \, m_0, \, v_0) \times \mbox{Uniform}(\sigma; \, 0, \, b_\sigma)\)
Half-Cauchy scale prior
\(\pi(\mu, \sigma) = \mbox{Normal}(\mu; \, m_0, \, v_0) \times \mbox{Cauchy}^+(\sigma; \, s_0)\)
where
\(\mbox{Cauchy}^+(\sigma; \, s_0) = \frac{2}{\pi s_0} \left( 1 + \frac{\sigma^2}{s_0^2} \right)^{-1} \, 1\{ \sigma > 0 \}\)
is a scaled Student-\(t\) with 1 degree of freedom, truncated below at 0.
The prior \(\pi(\sigma) = \mbox{Cauchy}^+(\sigma; \, s_0)\) is equivalent to
\[\begin{align} \sigma^2 \mid \eta &\sim \mbox{Inv-Gamma}\left(\frac{1}{2}, \, \frac{1}{\eta}\right) \, , \\ \eta &\sim \mbox{Inv-Gamma}\left(\frac{1}{2}, \, \frac{1}{s_0^2}\right) \, . \end{align}\]
Complete conditional distributions (that for \(\mu\) is the same Normal as in the conditionally conjugate case):
\[\begin{align} \sigma^2 \mid \eta, \, \mu, \, \boldsymbol{x} &\sim \mbox{Inv-Gamma}\left(\frac{n+1}{2}, \, \frac{\sum_{i=1}^n (x_i - \mu)^2}{2} + \frac{1}{\eta} \right) \, , \\ \eta \mid \sigma^2, \, \mu, \, \boldsymbol{x} &\sim \mbox{Inv-Gamma}\left(1, \, \frac{1}{\sigma^2} + \frac{1}{s_0^2}\right) \, . \end{align}\]
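These conditionals extend the previous Gibbs sampler with one extra step for \(\eta\); a sketch under the same conventions (the function name is my own):

```python
import numpy as np

def gibbs_half_cauchy(x, m0, v0, s0, n_iter=5_000, seed=0):
    """Gibbs sampler for the half-Cauchy scale prior via the Inv-Gamma mixture."""
    rng = np.random.default_rng(seed)
    n, xbar = len(x), np.mean(x)
    mu, sigma2, eta = xbar, np.var(x, ddof=1), 1.0
    draws = np.empty((n_iter, 2))
    for t in range(n_iter):
        # mu | sigma^2, x: same Normal complete conditional as before
        v1 = 1.0 / (1.0 / v0 + n / sigma2)
        m1 = v1 * (m0 / v0 + n * xbar / sigma2)
        mu = rng.normal(m1, np.sqrt(v1))
        # sigma^2 | eta, mu, x ~ Inv-Gamma((n+1)/2, sum_i (x_i - mu)^2 / 2 + 1/eta)
        b_sig = np.sum((x - mu) ** 2) / 2 + 1.0 / eta
        sigma2 = 1.0 / rng.gamma((n + 1) / 2, 1.0 / b_sig)
        # eta | sigma^2 ~ Inv-Gamma(1, 1/sigma^2 + 1/s0^2)
        eta = 1.0 / rng.gamma(1.0, 1.0 / (1.0 / sigma2 + 1.0 / s0 ** 2))
        draws[t] = mu, sigma2
    return draws
```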
Example
Consider the brass alloy zinc data described in Section 1.2 of BIDA and available at https://blogs.oregonstate.edu/bida/data-sets-and-code/
Assume the model \(x_i \overset{\text{iid}}{\sim} \mbox{Normal}(\mu, \sigma^2)\) for \(i = 1, \ldots, n\) with \(n=12\) and perform an analysis with